Multiple regression and interactions

Elizabeth King
Kevin Middleton

Multiple regression

Snake diets

  • 112 species of snakes
  • Ambush or active foragers
  • Snout-vent length (SVL)
  • Relative prey mass (RPM): 0 \(\rightarrow\) 1
  • Data from Glaudas et al. (2019)
Snakes |> count(ForagingMode)
# A tibble: 2 × 2
  ForagingMode     n
  <fct>        <int>
1 Active          69
2 Ambush          43

Exploratory visualization

ggplot(Snakes, aes(SVL, RPM, color = ForagingMode)) +
  geom_point(size = 4) +
  scale_colour_manual(values = c("red", "blue"))

Transformation and visualization

Snakes <- Snakes |> 
  mutate(log10SVL = log10(SVL))
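As a reminder of what the transformation does, log10 compresses large SVL values relative to small ones, pulling in the long right tail (illustrative numbers, not the real data):

```r
SVL <- c(44, 83.7, 174)   # a few illustrative SVL values (cm)
log10(SVL)
# A roughly 4x spread in SVL becomes a ~0.6-unit spread on the log10 scale
```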

Model statements

\(RPM\) follows a normal distribution:

\[RPM \sim Normal(\mu, \sigma)\]

\(\mu\) is a linear function of an intercept \(a\) for each level of ForagingMode and a slope (\(b\)) associated with log10SVL that is shared by both levels of ForagingMode:

\[\mu = a[ForagingMode] + b \cdot log10SVL\]

Corresponds to the additive model: RPM ~ ForagingMode + log10SVL - 1
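A quick way to see what `ForagingMode + log10SVL - 1` does is to inspect the design matrix on a toy data frame (hypothetical values, base R only):

```r
# Toy stand-in for the Snakes data (values are made up)
toy <- data.frame(
  ForagingMode = factor(c("Active", "Ambush", "Active", "Ambush")),
  log10SVL = c(1.7, 1.9, 2.0, 2.2)
)

# "- 1" drops the global intercept, so each level of ForagingMode
# gets its own indicator column (its own intercept),
# plus one shared slope column for log10SVL
X <- model.matrix(~ ForagingMode + log10SVL - 1, data = toy)
colnames(X)
# "ForagingModeActive" "ForagingModeAmbush" "log10SVL"
```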

Priors

The scale of RPM is pretty small, so the intercepts and slope will probably also be small:

\[a[ForagingMode] \sim Normal(0, 0.5)\]

\[b \sim Normal(0, 0.5)\]

\[\sigma \sim HalfNormal(0, 0.5)\]
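One way to check that these priors are on a sensible scale is a quick prior predictive simulation (a sketch; the range of log10SVL is assumed from the data shown elsewhere in these slides):

```r
set.seed(42)
n <- 100

# Draw intercepts and slopes from the priors
a <- rnorm(n, 0, 0.5)
b <- rnorm(n, 0, 0.5)

# Evaluate mu across an assumed observed range of log10SVL (~1.6 to 2.2)
log10SVL <- seq(1.6, 2.2, length.out = 50)
mu <- sapply(seq_len(n), function(i) a[i] + b[i] * log10SVL)

# Most prior regression lines should stay near RPM's 0-1 scale
range(mu)
```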

Recode ForagingMode as integer

Snakes <- Snakes |> 
  mutate(ForagingMode = as.integer(ForagingMode))
head(Snakes)
# A tibble: 6 × 4
  ForagingMode   SVL    RPM log10SVL
         <int> <dbl>  <dbl>    <dbl>
1            2  83.7 0.301      1.92
2            1  48.7 0.33       1.69
3            2 174.  0.075      2.24
4            2  44   0.33       1.64
5            1  87.4 0.0237     1.94
6            2  83.2 0.19       1.92
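The recoding works because factor levels map to integers in their level order, which is alphabetical by default, so Active becomes 1 and Ambush becomes 2 (a minimal check, assuming default level ordering):

```r
fm_levels <- factor(c("Ambush", "Active", "Ambush"))
levels(fm_levels)       # "Active" "Ambush" -- alphabetical by default
as.integer(fm_levels)   # 2 1 2
```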

Fitting the model

Setting refresh = 1000 makes Stan’s output a little less verbose.

fm <- ulam(
  alist(
    RPM ~ dnorm(mu, sigma),
    mu <- a[ForagingMode] + b * log10SVL,
    a[ForagingMode] ~ dnorm(0, 0.5),
    b ~ dnorm(0, 0.5),
    sigma ~ dhalfnorm(0, 0.5)
  ),
  data = Snakes,
  chains = 4,
  iter = 1e4,
  refresh = 1000
)

Fitting the model

Running MCMC with 4 sequential chains, with 1 thread(s) per chain...

Chain 1 Iteration:    1 / 10000 [  0%]  (Warmup) 
Chain 1 Iteration: 1000 / 10000 [ 10%]  (Warmup) 
Chain 1 Iteration: 2000 / 10000 [ 20%]  (Warmup) 
Chain 1 Iteration: 3000 / 10000 [ 30%]  (Warmup) 
Chain 1 Iteration: 4000 / 10000 [ 40%]  (Warmup) 
Chain 1 Iteration: 5000 / 10000 [ 50%]  (Warmup) 
Chain 1 Iteration: 5001 / 10000 [ 50%]  (Sampling) 
Chain 1 Iteration: 6000 / 10000 [ 60%]  (Sampling) 
Chain 1 Iteration: 7000 / 10000 [ 70%]  (Sampling) 
Chain 1 Iteration: 8000 / 10000 [ 80%]  (Sampling) 
Chain 1 Iteration: 9000 / 10000 [ 90%]  (Sampling) 
Chain 1 Iteration: 10000 / 10000 [100%]  (Sampling) 
Chain 1 finished in 1.6 seconds.
(Chains 2–4 print analogous progress output and are omitted.)

All 4 chains finished successfully.
Mean chain execution time: 1.5 seconds.
Total execution time: 6.2 seconds.

Examining the stan code

SVL is passed to Stan even though it is not used in the model.

  • Limit the data to the variables that are used
  • Stan will error if the data contain characters or factors (brms and rstanarm handle these automatically)
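The slimming step can be sketched as below (shown on a toy data frame standing in for Snakes):

```r
library(dplyr)

# Toy stand-in with the same columns as Snakes
Snakes_toy <- tibble::tibble(
  ForagingMode = c(1L, 2L),
  SVL = c(48.7, 83.7),
  RPM = c(0.33, 0.301),
  log10SVL = log10(c(48.7, 83.7))
)

# Keep only the variables the model actually uses before passing to ulam()
Snakes_fit <- Snakes_toy |>
  select(RPM, ForagingMode, log10SVL)

names(Snakes_fit)
```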
stancode(fm)
data{
    vector[112] SVL;
    vector[112] RPM;
    vector[112] log10SVL;
    int ForagingMode[112];
}
parameters{
    vector[2] a;
    real b;
    real<lower=0> sigma;
}
model{
    vector[112] mu;
    sigma ~ normal( 0 , 0.5 );
    b ~ normal( 0 , 0.5 );
    a ~ normal( 0 , 0.5 );
    for ( i in 1:112 ) {
        mu[i] = a[ForagingMode[i]] + b * log10SVL[i];
    }
    RPM ~ normal( mu , sigma );
}

Inspecting the chains: traceplot

traceplot(fm)

Inspecting the chains: Rank histogram

trankplot(fm)

Inspecting the output

precis(fm, depth = 2)
             mean         sd        5.5%      94.5%    n_eff    Rhat4
a[1]  0.129745646 0.09577204 -0.02450677 0.28451505 4776.263 1.000759
a[2]  0.223038679 0.10145103  0.06108287 0.38625743 4807.839 1.000863
b     0.006692846 0.05278036 -0.07756007 0.09150789 4746.225 1.000835
sigma 0.119571193 0.00814419  0.10730789 0.13332405 6906.534 1.000315

Posterior distributions

Use tidy_draws() to extract posterior samples into a tidy structure.

  • Unnest the levels of a[ForagingMode]
library(tidybayes)
library(tidybayes.rethinking)

post <- tidy_draws(fm)
head(post)
# A tibble: 6 × 14
  .chain .iter…¹ .draw `a[1]` `a[2]`       b sigma  lp__ accep…² treed…³ steps…⁴
   <int>   <int> <int>  <dbl>  <dbl>   <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>
1      1       1     1 0.0526  0.164  0.0318 0.120  180.   0.998       6  0.0630
2      1       2     2 0.0516  0.186  0.0542 0.121  178.   0.982       3  0.0630
3      1       3     3 0.0407  0.175  0.0484 0.121  180.   1           2  0.0630
4      1       4     4 0.220   0.364 -0.0574 0.113  179.   0.987       6  0.0630
5      1       5     5 0.206   0.267 -0.0324 0.119  181.   1.00        4  0.0630
6      1       6     6 0.210   0.262 -0.0360 0.119  179.   0.838       2  0.0630
# … with 3 more variables: divergent__ <dbl>, n_leapfrog__ <dbl>,
#   energy__ <dbl>, and abbreviated variable names ¹​.iteration, ²​accept_stat__,
#   ³​treedepth__, ⁴​stepsize__
colnames(post)
 [1] ".chain"        ".iteration"    ".draw"         "a[1]"         
 [5] "a[2]"          "b"             "sigma"         "lp__"         
 [9] "accept_stat__" "treedepth__"   "stepsize__"    "divergent__"  
[13] "n_leapfrog__"  "energy__"     

Plotting posteriors

Use ggdist::stat_halfeye() to plot density + intervals

post |> 
  dplyr::select(`a[1]`, `a[2]`, b) |>
  pivot_longer(cols = everything(),
               names_to = "Parameter",
               values_to = "Estimate") |> 
  ggplot(aes(x = Estimate, y = Parameter)) +
  stat_halfeye(point_interval = "median_hdi", .width = c(0.50, 0.89)) +
  scale_y_discrete(limits = rev) +
  labs(title = "Median and 50% / 89% HPDI")

Plotting posteriors

Posterior predictions

Sample from the posterior

post_sub <- post |> 
  slice_sample(n = 100) |> 
  dplyr::select(`a[1]`, `a[2]`, b)

post_sub
# A tibble: 100 × 3
    `a[1]`  `a[2]`       b
     <dbl>   <dbl>   <dbl>
 1 -0.120  -0.0136  0.148 
 2  0.355   0.431  -0.115 
 3  0.0643  0.165   0.0379
 4  0.0537  0.153   0.0460
 5  0.218   0.318  -0.0513
 6  0.299   0.424  -0.0879
 7 -0.0665  0.0721  0.0986
 8  0.194   0.246  -0.0179
 9  0.146   0.266  -0.0176
10  0.211   0.313  -0.0491
# … with 90 more rows

Plotting posterior predictions

Each red/blue pair of lines shares the same slope: the additive model forces a common b for both foraging modes.

ggplot() +
  geom_point(data = Snakes,
             aes(log10SVL, RPM, color = factor(ForagingMode)),
             size = 4) +
    scale_colour_manual(values = c("red", "blue"),
                        name = "Foraging Mode",
                        labels = c("Active", "Ambush")) +
  geom_abline(data = post_sub, aes(slope = b, intercept = `a[1]`),
              color = "red", alpha = 0.25) +
  geom_abline(data = post_sub, aes(slope = b, intercept = `a[2]`),
              color = "blue", alpha = 0.25)

Plotting posterior predictions

Interactions

Interactions are multiplications of predictors

  • When \(x_1\) and \(x_2\) are both continuous, the interaction adds a term \(\beta_3 \cdot x_1 \cdot x_2\)
  • When one \(x\) is categorical, 0 and 1 coding turns the interaction term “off” and “on”
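The on/off behavior of a 0/1-coded categorical predictor can be verified by hand (toy numbers, base R only):

```r
# Hypothetical coefficients (illustrative only)
b1 <- 0.2          # slope on the continuous predictor x1
b3 <- 0.5          # interaction coefficient

x1 <- 1.8          # a value of the continuous predictor
x2 <- c(0, 1)      # 0/1-coded group membership

# Interaction term b3 * x1 * x2 vanishes when x2 == 0
interaction_term <- b3 * x1 * x2
interaction_term
# 0.0 for the "off" group, 0.9 for the "on" group

# So the effective slope on x1 is b1 for group 0 and b1 + b3 for group 1
```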

Model statements

\(RPM\) follows a normal distribution:

\[RPM \sim Normal(\mu, \sigma)\]

\(\mu\) is a linear function of an intercept \(a\) for each level of ForagingMode and a slope (\(b\)) associated with log10SVL estimated separately for each level of ForagingMode:

\[\mu = a[ForagingMode] + b[ForagingMode] \cdot log10SVL\]

Corresponds to the interaction model: RPM ~ ForagingMode * log10SVL

Fitting the model

b[ForagingMode] * log10SVL directly estimates both slopes

  • Consider whether the prior on b still makes sense for each group’s slope
fm <- ulam(
  alist(
    RPM ~ dnorm(mu, sigma),
    mu <- a[ForagingMode] + b[ForagingMode] * log10SVL,
    a[ForagingMode] ~ dnorm(0, 0.5),
    b[ForagingMode] ~ dnorm(0, 0.5),
    sigma ~ dhalfnorm(0, 0.5)
  ),
  data = Snakes,
  chains = 4,
  iter = 1e4,
  refresh = 1000
)

Inspecting the chains: traceplot

traceplot(fm)

Inspecting the chains: Rank histogram

trankplot(fm)

Inspecting the output

precis(fm, depth = 2)
              mean          sd         5.5%     94.5%     n_eff    Rhat4
a[1]   0.322810359 0.132931149  0.113707780 0.5336861  8082.068 1.000536
a[2]   0.008534392 0.143786309 -0.219674190 0.2399464  7869.541 1.000300
b[1]  -0.100681558 0.073510132 -0.217729660 0.0144544  8110.778 1.000518
b[2]   0.120132571 0.075478382 -0.001865046 0.2400267  7837.627 1.000256
sigma  0.117222180 0.008035693  0.104990725 0.1306392 11108.783 1.000077
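Whether the two slopes differ can be assessed from the posterior of their difference, b[2] - b[1]. With the fitted model you would use the actual draws (e.g. via extract.samples() or tidy_draws()); the sketch below uses independent simulated stand-in draws with roughly the means and SDs reported by precis() above, purely for illustration (real draws are correlated, so this is only an approximation):

```r
set.seed(1)
n <- 4000

# Stand-in draws approximating the reported marginal posteriors
# (real code would use the correlated draws from the fitted model)
b1 <- rnorm(n, mean = -0.10, sd = 0.074)
b2 <- rnorm(n, mean = 0.12, sd = 0.075)

diff_b <- b2 - b1
mean(diff_b)                        # posterior mean of the slope difference
mean(diff_b > 0)                    # proportion of draws with slope 2 > slope 1
quantile(diff_b, c(0.055, 0.945))   # 89% interval
```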

Posterior distributions

post <- tidy_draws(fm)
post |> 
  dplyr::select(`a[1]`, `a[2]`, `b[1]`, `b[2]`) |>
  pivot_longer(cols = everything(),
               names_to = "Parameter",
               values_to = "Estimate") |> 
  ggplot(aes(x = Estimate, y = Parameter)) +
  stat_halfeye(point_interval = "median_hdi", .width = c(0.50, 0.89)) +
  scale_y_discrete(limits = rev) +
  labs(title = "Median and 50% / 89% HPDI")

Posterior distributions

Posterior predictions

Sample from the posterior

post_sub <- post |> 
  slice_sample(n = 100) |> 
  dplyr::select(`a[1]`, `a[2]`, `b[1]`, `b[2]`)

ggplot() +
  geom_point(data = Snakes,
             aes(log10SVL, RPM, color = factor(ForagingMode)),
             size = 4) +
    scale_colour_manual(values = c("red", "blue"),
                        name = "Foraging Mode",
                        labels = c("Active", "Ambush")) +
  geom_abline(data = post_sub, aes(slope = `b[1]`, intercept = `a[1]`),
              color = "red", alpha = 0.25) +
  geom_abline(data = post_sub, aes(slope = `b[2]`, intercept = `a[2]`),
              color = "blue", alpha = 0.25)

Posterior predictions

Next steps

  • Are these models “good”?
    • Posterior predictive bands
  • Are parameter estimates credibly different from zero?
    • Regions of practical equivalence
  • Which model is “better”?
    • Model comparison

References

Glaudas, X., K. L. Glennon, M. Martins, L. Luiselli, S. Fearn, D. F. Trembath, D. Jelić, and G. J. Alexander. 2019. Foraging Mode, Relative Prey Size and Diet Breadth: A Phylogenetically Explicit Analysis of Snake Feeding Ecology. J. Anim. Ecol. 88:757–767.